What’s next for generative AI: Household chores and more

Mar 7, 2024

Why It Matters

Two AI practitioners explain how “Large X models” that turn text into actions may ultimately allow generative AI to water plants and peel potatoes.

Generative AI can create novel and realistic content such as text, images, music, video, and code. It is already transforming the digital world, but what about the physical world? How will generative AI help us with tasks that involve machines and robots, such as watering our houseplants or assisting with industrial manufacturing?

In a recent study by researchers from the University of Oxford, artificial intelligence experts predicted that up to 40% of household chores (primarily housework like cooking, cleaning, and doing laundry) will be automated within the next 10 years. What technological developments are necessary before everyday tasks can be handed over to AI?

The concept of Large X Models

A key idea behind generative AI is the use of foundation models. We’ve coined a phrase for these types of foundation models — Large X Models, or LXMs — where X is the training data category required to give the AI general-purpose capabilities in a specific space. For language models, X is text data; for machine models, X is machine data; and for action models, X is human action data.

The opportunities for LXMs — including existing chat models, potential physical models assisting with household chores, and heavier lifts such as industrial manufacturing — are endless.

A well-known example of a foundation model, the large language model, is trained on a massive amount of text data and can generate coherent and fluent text for various applications.

Similarly, a Large Sensor Model is trained on swaths of sensor, process, and machine event data across a large variety of machines, processes, products, and sensors. Large Sensor Models could be used to monitor, diagnose, and optimize industrial machines and processes and generate new designs and configurations.

Along the same lines, a Large Behavior Model trained on data derived from videos of humans doing physical tasks could be used to teach robots how to perform various chores and activities — such as watering plants or peeling potatoes.

The promise of LXMs

Though traditional AI has been applied to reduce machine failures and manufacturing defects, and to improve manufacturing efficiency in terms of raw materials and energy consumption, progress so far has been hard to achieve.

LXMs will accelerate advancements in these areas. A Large Sensor Model would not only be generally usable out of the box but also easier for factory operators to use if a natural language interface were added on top of it via an LLM. Right away, such a construct removes the friction points of data, accuracy, and adoption.

Natural language-driven, zero-downtime, zero-defect factories with reduced energy consumption sound cool, but maybe all you’re looking for help with is making coffee or other domestic tasks. Toyota Research Institute is using diffusion (a generative AI technique used for popular text-to-image applications) to teach robots to peel vegetables, among many other tasks. Tasks that took months to teach machines using programming now take an afternoon using generative AI.

Brett Adcock, founder of AI robotics company Figure, thinks we’ll have access to those benefits by 2030. He points out the business attractiveness of tackling industrial labor use cases before addressing the technical complexity of home environments. We agree that it’s a matter of years, not decades, before we see deployments in these areas. And we like that Figure is not the only company using elbow grease to save humans from having to do some manual tasks; there are others, and they are leveraging generative AI to power through some hard problems in the physical space.

Enter general-purpose robots powered by LXMs

Machine automation has long been a driver of productivity gains in the major areas of physical work: agriculture, manufacturing, construction, logistics, maintenance, domestic work, hospitality, and health care. Industrial robots are another driver of productivity in physical work. Whereas those have traditionally been programmed for specific tasks, smart and autonomous robots have emerged primarily in the past decade, with prevalent use cases in logistics.

In the past two years, there has been a wave of general-purpose humanoid robots, such as those from Agility, Boston Dynamics, Figure, Prosper, Sanctuary, and Tesla. Prosper says it is building Alfie, a robotic helper for the home or office. According to the company, Alfie can clean, organize your things, and take care of small chores, such as watering plants.

We see a future where there are general-purpose robots for a broad range of tasks, powered by multiple LXMs. Carnegie Mellon University robotics researchers have already enabled robots to learn household chores by watching videos of people performing everyday tasks in their homes.

The path forward

These kinds of activities are becoming more conceivable by the day. New research presented by the robotics team at DeepMind, Google’s AI lab, describes how using powerful LLMs that cover automation, reaction times, and motion tracking allowed its robots to learn and understand complex tasks. To show the potential of these findings, the team even produced models of robots opening and closing drawers, removing soda cans from countertops, and moving items around.

Overall, the technological possibilities with generative AI are awe-inspiring. While we are still in the early innings with LLMs, companies have been developing, and continue to develop, models such as Toyota’s Large Behavior Model, Google’s general-purpose RT-X model, and Runway’s general world models.

By using such models, robots are able to better understand their environments and the dynamics of operating within them. This opens the door to more realistic, humanlike behavior: first the execution of small automated tasks, such as plant watering and potato peeling, and eventually more complex and laborious work, such as industrial manufacturing. While the road to this transformation is long and winding, it will one day be paved by these LXMs.

Atin Gupta is vice president of strategy and innovation at BuzzBoard.ai. Geoffrey G. Parker is a professor of engineering innovation at Dartmouth College and a research fellow and visiting scholar at the MIT Initiative for the Digital Economy.